Orthogonal Regularization
Expandable and Differentiable Dual Memories with Orthogonal Regularization for Exemplar-free Continual Learning
Moon, Hyung-Jun, Cho, Sung-Bae
Conventional continual learning methods force neural networks to process sequential tasks in isolation, preventing them from leveraging useful inter-task relationships and causing them to repeatedly relearn similar features or overly differentiate them. To address this problem, we propose a fully differentiable, exemplar-free expandable method composed of two complementary memories: one learns common features that can be used across all tasks, and the other combines the shared features to learn discriminative characteristics unique to each sample. Both memories are differentiable so that the network can autonomously learn latent representations for each sample. For each task, the memory adjustment module adaptively prunes critical slots and minimally expands capacity to accommodate new concepts, and orthogonal regularization enforces geometric separation between preserved and newly learned memory components to prevent interference. Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet show that the proposed method outperforms 14 state-of-the-art methods for class-incremental learning, achieving final accuracies of 55.13%, 37.24%, and 30.11%, respectively. Additional analysis confirms that, through effective integration and utilization of knowledge, the proposed method increases average performance across sequential tasks and produces feature-extraction results closest to the upper bound, establishing a new milestone in continual learning.
- North America > United States > Hawaii (0.04)
- Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)
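The geometric-separation idea in the abstract above can be sketched numerically: penalize the overlap between preserved memory slots and newly expanded ones so that new concepts occupy orthogonal directions. This is an illustrative assumption, not the paper's code; all names (`M_old`, `M_new`, `ortho_penalty`) are hypothetical.

```python
import numpy as np

def ortho_penalty(M_old: np.ndarray, M_new: np.ndarray) -> float:
    """Squared Frobenius norm of cross inner products between slot sets;
    zero iff every preserved slot is orthogonal to every new slot."""
    cross = M_old @ M_new.T              # (k_old, k_new) pairwise inner products
    return float(np.sum(cross ** 2))

rng = np.random.default_rng(0)
M_old = rng.normal(size=(4, 16))         # 4 preserved memory slots, dim 16
M_new = rng.normal(size=(2, 16))         # 2 freshly expanded slots
penalty = ortho_penalty(M_old, M_new)    # > 0 for random slots

# Projecting the new slots onto the orthogonal complement of the preserved
# subspace drives the penalty to (numerically) zero.
P = M_old.T @ np.linalg.pinv(M_old @ M_old.T) @ M_old
M_new_perp = M_new - M_new @ P
penalty_perp = ortho_penalty(M_old, M_new_perp)   # ~ 0 after projection
```

In training, such a penalty would be added to the task loss so gradient descent, rather than an explicit projection, keeps the components separated.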
propagation on the DAVIS dataset (Table 1), in comparison to a SOTA self-supervised method [49] and the ImageNet pre-trained representation.

Model                        J (Mean)
Self-supervised, SOTA [49]   43.0
ImageNet Representation      49.4
Self-supervised, Ours        57.7

The shared affinity matrix bridges these tasks and facilitates iterative improvements. These contributions are significant in the field of self-supervised learning, and they are also demonstrated by our ablation study (Table 2 in the paper). We note that these components are novel and have not been explored in prior work. In the following, we address the other comments by reviewers.
Tokenize features, enhancing tables: the FT-TabPFN model for tabular classification
Liu, Quangao, Yang, Wei, Liang, Chen, Pang, Longlong, Zou, Zhuozhang
Traditional methods for tabular classification usually rely on supervised learning from scratch, which requires extensive training data to determine model parameters. However, a novel approach called Prior-Data Fitted Networks (TabPFN) has changed this paradigm. TabPFN uses a 12-layer transformer trained on large synthetic datasets to learn universal tabular representations. This method enables fast and accurate predictions on new tasks with a single forward pass and no need for additional training. Although TabPFN has been successful on small datasets, it generally shows weaker performance when dealing with categorical features. To overcome this limitation, we propose FT-TabPFN, an enhanced version of TabPFN that includes a novel Feature Tokenization layer to better handle categorical features. By fine-tuning it for downstream tasks, FT-TabPFN not only expands the functionality of the original model but also significantly improves its applicability and accuracy in tabular classification. Our full source code is available for community use and development.
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
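A feature-tokenization layer of the kind the FT-TabPFN abstract describes can be sketched as one embedding table per categorical column, so the transformer receives a learned token per feature rather than a raw ordinal code. Everything below (`FeatureTokenizer`, `d_token`) is an illustrative assumption, not the released implementation.

```python
import numpy as np

class FeatureTokenizer:
    """Map integer-coded categorical features to per-feature embedding tokens."""

    def __init__(self, cardinalities, d_token=8, seed=0):
        rng = np.random.default_rng(seed)
        # One embedding table per categorical column, shape (n_categories, d_token).
        self.tables = [rng.normal(scale=0.1, size=(c, d_token))
                       for c in cardinalities]

    def __call__(self, x_cat):
        # x_cat: (n_rows, n_cat_features) integer category indices.
        tokens = [tab[x_cat[:, j]] for j, tab in enumerate(self.tables)]
        return np.stack(tokens, axis=1)   # (n_rows, n_features, d_token)

tok = FeatureTokenizer(cardinalities=[3, 5], d_token=8)
x = np.array([[0, 4],
              [2, 1]])                    # two rows, two categorical features
out = tok(x)                              # shape (2, 2, 8)
```

In a trainable version the tables would be learned parameters updated by backpropagation during fine-tuning; the lookup-and-stack structure stays the same.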
Estimating Average Treatment Effects via Orthogonal Regularization
Hatt, Tobias, Feuerriegel, Stefan
Decision-making often requires accurate estimation of treatment effects from observational data. This is challenging as outcomes of alternative decisions are not observed and have to be estimated. Previous methods estimate outcomes based on unconfoundedness but neglect any constraints that unconfoundedness imposes on the outcomes. In this paper, we propose a novel regularization framework for estimating average treatment effects that exploits unconfoundedness. To this end, we formalize unconfoundedness as an orthogonality constraint, which ensures that the outcomes are orthogonal to the treatment assignment. This orthogonality constraint is then included in the loss function as a regularization term. Based on our regularization framework, we develop deep orthogonal networks for unconfounded treatments (DONUT), which learn outcomes that are orthogonal to the treatment assignment. Using a variety of benchmark datasets for estimating average treatment effects, we demonstrate that DONUT outperforms the state-of-the-art substantially.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States (0.04)
- Health & Medicine > Public Health (0.93)
- Health & Medicine > Therapeutic Area (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Data Science (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
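The orthogonality constraint in the DONUT abstract can be sketched as follows: under unconfoundedness, outcome-model residuals should be uncorrelated with the propensity-centered treatment indicator, so the squared empirical inner product of the two is pushed toward zero. This is a simplified numpy sketch under assumed names (`orthogonality_penalty`, `pi_hat`), not the authors' network code.

```python
import numpy as np

def orthogonality_penalty(y, y_hat, t, pi_hat):
    """Squared empirical inner product of outcome residuals with the
    propensity-centered treatment indicator (small under unconfoundedness
    when the outcome model is correct)."""
    residual = y - y_hat             # outcome-model residuals
    centered_t = t - pi_hat          # treatment minus estimated propensity
    return float(np.mean(residual * centered_t) ** 2)

rng = np.random.default_rng(1)
n = 1000
t = rng.integers(0, 2, size=n).astype(float)
y = 2.0 * t + rng.normal(size=n)     # synthetic data, true effect = 2
pi_hat = np.full(n, t.mean())        # crude propensity estimate

# A model that ignores treatment leaves treatment signal in its residuals;
# a model that captures the effect makes the residuals near-orthogonal to t.
penalty_bad = orthogonality_penalty(y, np.zeros(n), t, pi_hat)
penalty_good = orthogonality_penalty(y, 2.0 * t, t, pi_hat)
```

Added to a prediction loss, such a term steers the fitted outcome functions toward estimates whose residuals carry no treatment information.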
Neural Photo Editing with Introspective Adversarial Networks
Brock, Andrew, Lim, Theodore, Ritchie, J. M., Weston, Nick
The increasingly photorealistic sample quality of generative image models suggests their feasibility in applications beyond image generation. We present the Neural Photo Editor, an interface that leverages the power of generative neural networks to make large, semantically coherent changes to existing images. To tackle the challenge of achieving accurate reconstructions without loss of feature quality, we introduce the Introspective Adversarial Network, a novel hybridization of the VAE and GAN. Our model efficiently captures long-range dependencies through the use of a computational block based on weight-shared dilated convolutions, and improves generalization performance with Orthogonal Regularization, a novel weight regularization method. We validate our contributions on CelebA, SVHN, and CIFAR-100, and produce samples and reconstructions with high visual fidelity.
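The weight-regularization idea named in this abstract — penalizing deviation of a weight matrix from orthonormality via a Gram-matrix residual — can be sketched in a few lines. A minimal numpy version under assumed names (`orthogonal_reg`); the paper applies the penalty to convolutional filter banks reshaped into matrices.

```python
import numpy as np

def orthogonal_reg(W: np.ndarray) -> float:
    """Frobenius penalty ||W W^T - I||_F^2 pushing the rows of W
    toward an orthonormal set."""
    k = W.shape[0]
    gram = W @ W.T                        # (k, k) row Gram matrix
    return float(np.sum((gram - np.eye(k)) ** 2))

rng = np.random.default_rng(2)
W = rng.normal(size=(4, 10))
penalty_random = orthogonal_reg(W)        # > 0 for a random matrix

# An orthonormal row set (e.g. from a QR factorization) makes it vanish.
Q, _ = np.linalg.qr(W.T)                  # columns of Q are orthonormal
penalty_ortho = orthogonal_reg(Q.T)       # ~ 0
```

During training the penalty is simply added to the task loss with a small coefficient, nudging filters toward orthonormality without hard-constraining them.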
Deep Multimodal Hashing with Orthogonal Regularization
Wang, Daixin, Cui, Peng, Ou, Mingdong, Zhu, Wenwu (all Tsinghua University)
Hashing is an important method for performing efficient similarity search. With the explosive growth of multimodal data, how to learn hashing-based compact representations for multimodal data becomes highly non-trivial. Compared with shallow structured models, deep models present superiority in capturing multimodal correlations due to their high nonlinearity. However, in order to make the learned representation more accurate and compact, how to reduce the redundant information lying in the multimodal representations and incorporate different complexities of different modalities in the deep models is still an open problem. In this paper, we propose a novel deep multimodal hashing method, namely Deep Multimodal Hashing with Orthogonal Regularization (DMHOR), which fully exploits intra-modality and inter-modality correlations. In particular, to reduce redundant information, we impose an orthogonal regularizer on the weight matrices of the model, and theoretically prove that the learned representation is guaranteed to be approximately orthogonal. Moreover, we find that a better representation can be attained with different numbers of layers for different modalities, due to their different complexities. Comprehensive experiments on WIKI and NUS-WIDE demonstrate a substantial gain of DMHOR compared with state-of-the-art methods.
- Asia > Singapore (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Sensing and Signal Processing > Image Processing (0.68)
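The DMHOR abstract combines two ingredients that can be sketched together: an orthogonal regularizer summed over a stack of weight matrices (to reduce redundancy in the learned representation), applied to modality-specific networks of different depths. The names and layer sizes below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def stack_ortho_penalty(weights):
    """Sum of ||W^T W - I||_F^2 over a list of weight matrices, so that the
    columns of each layer's weight matrix are pushed toward orthonormality."""
    total = 0.0
    for W in weights:
        d = W.shape[1]
        total += float(np.sum((W.T @ W - np.eye(d)) ** 2))
    return total

rng = np.random.default_rng(3)
# Modality-specific pathways with different depths, as the paper's finding
# about per-modality complexity suggests: a deeper image pathway and a
# shallower text pathway feeding a shared hashing layer.
image_net = [rng.normal(size=(64, 32)),
             rng.normal(size=(32, 16)),
             rng.normal(size=(16, 8))]
text_net = [rng.normal(size=(40, 16)),
            rng.normal(size=(16, 8))]

penalty = stack_ortho_penalty(image_net) + stack_ortho_penalty(text_net)
```

With orthonormal columns, distinct dimensions of each hidden representation become decorrelated, which is the redundancy-reduction effect the abstract's approximate-orthogonality guarantee formalizes.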